rasbt python-machine-learning-book
rasbt/python-machine-learning-book
Sebastian Raschka's new book, Python Machine Learning, has just been released. I got a chance to read a review copy and it's just as I expected - really great! It's well organized and super easy to follow, and it not only offers a good foundation for smart non-experts: practitioners will get some ideas and learn new tricks here as well.
Softmax Regression (synonyms: Multinomial Logistic Regression, Maximum Entropy Classifier, or just Multi-class Logistic Regression) is a generalization of logistic regression that we can use for multi-class classification (under the assumption that the classes are mutually exclusive). In contrast, we use the (standard) Logistic Regression model in binary classification tasks. Now, let me briefly explain how that works and how softmax regression differs from logistic regression. As the name suggests, in softmax regression (SMR), we replace the sigmoid logistic function with the so-called softmax function φ: p(y = j | z(i)) = exp(z_j(i)) / [exp(z_1(i)) + ... + exp(z_k(i))]. This softmax function computes the probability that a training sample x(i) belongs to class j given the weights and the net input z(i). So, we compute the probability p(y = j | x(i); w_j) for each class label j = 1, ..., k.
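To make the softmax step concrete, here is a minimal sketch in plain Python. The net inputs in `z` are made-up numbers for a single sample and k = 3 classes, and the max-subtraction trick is a common numerical safeguard that I am adding here, not something taken from the text above.

```python
import math

def softmax(z):
    """Turn a list of net inputs (one per class) into class probabilities."""
    # Subtracting the maximum leaves the result unchanged
    # but keeps exp() from overflowing for large net inputs.
    z_max = max(z)
    exps = [math.exp(v - z_max) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical net inputs z(i) for one sample and k = 3 classes
z = [2.0, 1.0, 0.1]
probs = softmax(z)

# The predicted class label is the one with the highest probability
predicted_class = probs.index(max(probs))
```

The probabilities sum to one, which is what lets us read them as a distribution over mutually exclusive classes. With only two classes and net inputs `[z, 0.0]`, the first probability reduces to the familiar logistic sigmoid of `z`.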
That's an interesting question, and I'll try to answer it in a very general way. In essence, deep learning offers a set of techniques and algorithms that help us to parameterize deep neural network structures -- artificial neural networks with many hidden layers and parameters. One of the key ideas behind deep learning is to extract high-level features from the given dataset. Thereby, deep learning aims to overcome the challenge of the often tedious feature engineering task and helps with parameterizing traditional neural networks with many layers. Now, to introduce deep learning, let us take a look at a more concrete example involving multi-layer perceptrons (MLPs).
Let's assume we are really into mountain climbing, and to add a little extra challenge, we cover our eyes this time so that we can't see where we are or when we have accomplished our "objective," that is, reaching the top of the mountain. Since we can't see the path upfront, we let our intuition guide us: assuming that the mountain top is the "highest" point of the mountain, we think that the steepest path leads us to the top most efficiently. We approach this challenge by iteratively "feeling" around us and taking a step in the direction of the steepest ascent -- let's call it "gradient ascent." But what do we do if we reach a point where we can't ascend any further? I.e., each direction leads downwards?
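The blindfolded climb can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the one-dimensional "mountain" `height`, its derivative `slope`, the step size, and the choice to simply stop once the slope is (numerically) zero.

```python
def gradient_ascent(grad, start, step_size=0.1, tol=1e-8, max_steps=10_000):
    """Repeatedly step uphill until the ground feels flat."""
    x = start
    for _ in range(max_steps):
        g = grad(x)
        if abs(g) < tol:
            # No direction leads upwards anymore: we stop here.
            break
        x = x + step_size * g  # step in the direction of steepest ascent
    return x

# Hypothetical mountain with a single peak at x = 3
def height(x):
    return -(x - 3.0) ** 2 + 5.0

def slope(x):
    return -2.0 * (x - 3.0)

peak = gradient_ascent(slope, start=0.0)
```

On a mountain with several peaks, the same procedure would stop at whichever local peak it reaches first, since it only ever uses local slope information.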
Software engineering is about developing programs or tools to automate tasks. Instead of "doing things manually," we write programs; a program is basically just a machine-readable set of instructions that can be executed by a computer. Let's consider a classic example: e-mail spam filtering. Assuming that we have access to the source code of our e-mail client and know how to handle it, we could come up with an intuitive set of rules that may help us with our spam problem. For example: if sender not in contacts: if subject line contains "BUY!": move e-mail to spam folder; else if ... It is easy to see that coming up with these rules is a pretty tedious task.
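Written out as runnable Python, such a hand-crafted rule set might look like the sketch below. The function name `is_spam`, the e-mail addresses, and the idea of returning a boolean instead of moving a message are all hypothetical; only the two rules themselves come from the example above.

```python
def is_spam(sender, subject, contacts):
    """Hand-written spam rules; each rule has to be thought up by a person."""
    if sender not in contacts:
        if "BUY!" in subject:
            return True
        # ... every further rule would need another hand-written branch here
    return False

# Hypothetical address book
contacts = {"alice@example.com"}

flagged = is_spam("bob@example.com", "BUY! cheap watches", contacts)
```

Each new spam pattern means another branch to write and maintain by hand, which is exactly what makes this approach tedious.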
That's an interesting question, and I'll try to answer it in a very general way. The tl;dr version of this is: Deep learning is essentially a set of techniques that help us to parameterize deep neural network structures, that is, neural networks with many, many layers and parameters. And if we are interested, a more concrete example: Let's start with multi-layer perceptrons (MLPs) ... On a tangent: The term "perceptron" in MLPs may be a bit confusing since we don't really want only linear neurons in our network. Using MLPs, we want to learn complex functions to solve non-linear problems. Thus, our network is conventionally composed of one or multiple "hidden" layers that connect the input and output layer.
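As a small illustration of why the hidden layer matters, the sketch below runs a forward pass through an MLP with one hidden layer on the XOR problem, which no single linear neuron can solve. The weights are hand-picked for illustration rather than learned, and the helper names `dense` and `mlp_forward` are my own.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum per unit, then a sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

# Hand-picked (not learned) parameters, chosen only for illustration.
# Hidden unit 1 behaves like OR, hidden unit 2 like NAND, the output like AND.
W_hidden = [[20.0, 20.0], [-20.0, -20.0]]
b_hidden = [-10.0, 30.0]
W_out = [[20.0, 20.0]]
b_out = [-30.0]

def mlp_forward(x):
    hidden = dense(x, W_hidden, b_hidden)    # input layer -> hidden layer
    return dense(hidden, W_out, b_out)[0]    # hidden layer -> output unit

inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
predictions = [round(mlp_forward(x)) for x in inputs]  # [0, 1, 1, 0]
```

In practice these parameters would be learned from data; the point here is only that stacking non-linear units lets the network represent a function that a single linear unit cannot.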
TensorFlow is more of a low-level library; basically, we can think of TensorFlow as the Lego bricks (similar to NumPy and SciPy) that we can use to implement machine learning algorithms, whereas scikit-learn comes with off-the-shelf algorithms, e.g., algorithms for classification such as SVMs, Random Forests, Logistic Regression, and many, many more. TensorFlow really shines if we want to implement deep learning algorithms, since it allows us to take advantage of GPUs for more efficient training. To get a better idea of how these two libraries differ, let's fit a softmax regression model on the Iris dataset via scikit-learn: Now, if we want to fit a softmax regression model via TensorFlow, however, we have to "build" the algorithm first. But it sounds more complicated than it really is. TensorFlow comes with many "convenience" functions and utilities; for example, if we want to use a gradient descent optimization approach, the core of our implementation could look like this:
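The original code listings are not part of this excerpt, so what follows is neither scikit-learn nor TensorFlow code. It is a plain-Python stand-in that sketches what "building" softmax regression with batch gradient descent involves when we only have low-level building blocks. The toy dataset (three small 2D clusters instead of Iris), the learning rate, and the number of epochs are all assumptions.

```python
import math

def softmax(z):
    z_max = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - z_max) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(x, W, b):
    # Net input per class: z_j = w_j . x + b_j
    z = [sum(w * v for w, v in zip(w_j, x)) + b_j for w_j, b_j in zip(W, b)]
    return softmax(z)

def predict(x, W, b):
    p = predict_proba(x, W, b)
    return p.index(max(p))

def fit(X, y, n_classes, lr=0.1, epochs=1000):
    """Batch gradient descent on the average cross-entropy loss."""
    n, n_features = len(X), len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            p = predict_proba(x, W, b)
            for j in range(n_classes):
                # Gradient w.r.t. the net input: probability minus one-hot target
                err = p[j] - (1.0 if j == label else 0.0)
                grad_b[j] += err
                for d in range(n_features):
                    grad_W[j][d] += err * x[d]
        for j in range(n_classes):
            b[j] -= lr * grad_b[j] / n
            for d in range(n_features):
                W[j][d] -= lr * grad_W[j][d] / n
    return W, b

# Hypothetical toy data: three well-separated clusters, one per class
X = [[0.0, 0.2], [0.3, 0.0], [0.2, 0.3],
     [3.0, 0.1], [2.8, 0.3], [3.2, 0.0],
     [0.1, 3.0], [0.3, 2.9], [0.0, 3.2]]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

W, b = fit(X, y, n_classes=3)
predictions = [predict(x, W, b) for x in X]
```

With an off-the-shelf estimator, the whole `fit` function collapses into a single method call. A lower-level library instead hands us the pieces (here: the softmax, the gradient, and the update rule) and leaves the assembly to us.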
Why did I bother writing this? Well, here is one of the most trivial yet life-changing insights and worldly wisdoms from my former professor that has become my mantra ever since: "If you have to do this task more than 3 times just write a script and automate it." By now, you may have already started wondering about this blog. I haven't written anything for more than half a year! Okay, musings on social network platforms aside, that's not true: I have written something -- about 400 pages to be precise. This has really been quite a journey for me lately. And regarding the frequently asked question "Why did you choose Python for Machine Learning?"
There are two fundamental milestones I'd say. The first one is Fisher's Linear Discriminant [1], later generalized by Rao [2] to what we know as Linear Discriminant Analysis (LDA). Essentially, LDA is a linear transformation (or projection) technique, which is mainly used for dimensionality reduction (i.e., the objective is to find the k-dimensional feature subspace that -- linearly -- separates the samples from different classes best). Given the objective to maximize class separability, projecting the 2D dataset below onto the "x-axis component" would be a better choice than the "y-axis component." Keep in mind though that LDA is a projection technique; the feature axes of your new feature subspace are (almost certainly) different from your original axes.
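For the two-class case, Fisher's criterion leads to a projection direction proportional to S_W^-1 (m_a - m_b), where S_W is the within-class scatter matrix and m_a, m_b are the class means. The sketch below computes that direction in plain Python for a made-up 2D dataset in which the classes differ mainly along the x-axis. The data and the helper names are assumptions for illustration.

```python
def mean_vector(points):
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter_matrix(points, mean):
    """2x2 scatter matrix: sum of (x - m)(x - m)^T over all points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - mean[0], p[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class_a, class_b):
    """Unit vector proportional to S_W^-1 (m_a - m_b) for two 2D classes."""
    m_a, m_b = mean_vector(class_a), mean_vector(class_b)
    s_a, s_b = scatter_matrix(class_a, m_a), scatter_matrix(class_b, m_b)
    s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    # Closed-form inverse of a 2x2 matrix
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    inv = [[s_w[1][1] / det, -s_w[0][1] / det],
           [-s_w[1][0] / det, s_w[0][0] / det]]
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
    return [w[0] / norm, w[1] / norm]

# Hypothetical data: classes differ in x, both spread widely in y
class_a = [(1.0, 1.0), (1.2, 3.0), (0.8, 5.0), (1.0, 7.0)]
class_b = [(3.0, 1.5), (3.2, 3.5), (2.8, 5.5), (3.0, 7.5)]

w = fisher_direction(class_a, class_b)
proj_a = [w[0] * x + w[1] * y for x, y in class_a]
proj_b = [w[0] * x + w[1] * y for x, y in class_b]
```

For this data the direction `w` is dominated by its x-component but is not exactly the x-axis, and the projected classes no longer overlap on the resulting one-dimensional axis.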